Prognostic factors for squamous cervical carcinoma identified by competing-risks analysis: A study based on the SEER database

Cervical cancer is a malignant tumor with high incidence and mortality, and squamous cervical carcinoma (SCC) accounts for 80% of cases. A competing-risks model is recommended as being more suitable than Cox regression for evaluating prognosis and guiding clinical practice. Data from the Surveillance, Epidemiology, and End Results (SEER) database for 2004 to 2013 were analyzed. Univariate analysis with the cumulative incidence function was performed to assess the potential risk associated with each covariate. Significant covariates (P < .05) were included in a Cox regression analysis and in a competing-risks analysis comprising a cause-specific (CS) hazard function model and a sub-distribution (SD) hazard function model. A total of 5591 SCC patients met the inclusion criteria. All three methods (Cox regression analysis, CS analysis, and SD analysis) showed that age, metastasis, American Joint Committee on Cancer stage, surgery, chemotherapy, radiation sequence with surgery, lymph node dissection, tumor size, and tumor grade were prognostic factors affecting survival in patients with SCC. In contrast, race and radiation status were prognostic factors affecting survival in the Cox regression and CS analyses, but the results differed in the SD analysis. Being separated, divorced, or widowed was an independent prognostic factor in the Cox regression analysis, but the results differed in the CS and SD analyses. A competing-risks model was used as a new statistical method to identify prognostic factors more accurately than conventional Cox regression analysis, which can lead to biased results. This study found that the SD model may be better suited to estimating the clinical prognosis of a patient, and that the results of the SD analysis were close to those of the CS analysis.


Introduction
Cervical cancer is a malignant tumor of the female reproductive system with a high incidence, ranking second only to breast cancer, and it is the fourth leading cause of female malignancy, thereby representing a major threat to the health and life of women. It is reported that the average number of new cases annually is about 500,000 with about 250,000 deaths, and the number of new cases in China each year is more than 10,000 corresponding to about 25% of the global reported new cases of cervical cancer. [1,2] In 1992, the WHO announced that high-risk human papillomavirus (HPV) infection was the primary factor contributing to cervical cancer. In 1995, the International Cancer Society also proposed that HPV infection, especially long-term persistent infection with high-risk HPV, was the main cause of cervical intraepithelial neoplasia and its further development into cervical cancer. A meta-analysis showed that high-grade squamous intraepithelial lesions infected with HPV 16, 18, and 45 are more likely to develop into squamous cervical carcinoma (SCC) than those infected with other subtypes of HPV. [3] The use of a bivalent vaccine against HPV has greatly reduced the incidence of cervical cancer, but this remains the most commonly diagnosed type of cancer at present in both developed and developing countries. [4] A European randomized controlled trial indicated that the application of HPV-based cervical screening could reduce the number of women who develop invasive cancers by 60% to 70%, [5] and so it should be promoted among young women starting from the age of 30 years, followed by screening every 5 years.
While the cause of cervical cancer is clear, screening and prevention apply only before the disease has developed. For patients who already have cervical cancer, many factors influence its progression and hence the cancer mortality rate. Some of the factors that have been analyzed using conventional Cox regression include basic patient characteristics (e.g., age, stage, race, and marital status), tumor characteristics (e.g., tumor stage, tumor size, invasion, metastasis, and lymph node [LN] status), and treatments (e.g., surgery, chemotherapy, radiation, and lymph node dissection [LND]). [6][7][8][9][10][11][12][13] Overall, tumor characteristics and treatments are closely related to the prognosis of the disease. However, the applied treatment is largely determined by the nature of the tumor. Surgery and radiation therapy are major treatments for cervical cancer, with chemotherapy being an adjunctive systemic therapy. Surgery and/or radiation are often applied to patients at cervical cancer stages IA2-IIA, with radical hysterectomy combined with regional lymphadenectomy being a conventional treatment for these patients, and ovarian conservation at hysterectomy having positive effects in decreasing the all-cause mortality. However, for locally advanced cervical cancer, concurrent chemoradiotherapy has been a better choice. These conclusions were mostly obtained from Cox regression analyses, which may be subject to some bias. The present study found that a competing-risks model has its own advantages in analyzing the prognosis.
In medical practice, a longitudinal analysis does not always identify only the events of interest to researchers, but also some outcomes that are not of interest. The events of interest and the other outcomes compete with each other. For such data with competing risks, previous approaches have treated competing events as censored data. If competing risks are ignored, the traditional univariate Kaplan-Meier method will overestimate the cumulative mortality, and the traditional Cox multivariate regression analysis may provide only poor estimates of the hazard ratio (HR). Previous versions of SAS software could not fit competing-risks models, whereas the R software provides an add-on package (called "cmprsk") that is widely used for the sub-distribution (SD) hazard function competing-risks model. Version 9.4 of SAS uses the SD model, also called the cumulative incidence function (CIF) regression model or Fine-Gray model, in conjunction with the cause-specific (CS) hazard function model to better assess the prognosis of a disease in a competing-risks model. [14] A CS model may be better suited to addressing etiological questions, whereas an SD model might be better suited to estimating the clinical prognosis of a patient. [15] Some studies have indicated that using SD and CS models simultaneously is generally the most rigorous scientific approach to analyzing competing-risks data. [16] In this study, the primary endpoint of concern was death due to squamous cervical carcinoma (DCC), and the competing event was death due to other causes (DOC). SAS statistical software (version 9.4) was used to assess the survival of patients with SCC with the aim of identifying more accurate and reliable risk factors for DCC, in order to guide clinical practice.
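For reference, the two hazard functions contrasted above are conventionally defined as follows. The notation is the standard textbook one and is an assumption here, not taken from this study: $T$ is the time to the first event, $D$ is the event type ($D = 1$ for DCC, $D = 2$ for DOC), $F_k$ is the CIF for event type $k$, and $x$ is a covariate vector.

```latex
\begin{align}
\lambda_k^{\mathrm{CS}}(t) &= \lim_{\Delta t \to 0}
  \frac{P(t \le T < t + \Delta t,\ D = k \mid T \ge t)}{\Delta t}, \\
F_k(t) &= P(T \le t,\ D = k), \\
\lambda_k^{\mathrm{SD}}(t) &= \lim_{\Delta t \to 0}
  \frac{P\bigl(t \le T < t + \Delta t,\ D = k \mid T \ge t \ \text{or}\ (T < t,\ D \ne k)\bigr)}{\Delta t}
  = -\frac{d}{dt} \log\bigl\{1 - F_k(t)\bigr\}, \\
\lambda_k^{\mathrm{SD}}(t \mid x) &= \lambda_{k0}^{\mathrm{SD}}(t)\, \exp(\beta^{\top} x)
  \quad \text{(Fine-Gray model)}.
\end{align}
```

The only difference between the two hazards is the risk set: the CS hazard conditions on subjects who are still event-free, whereas the SD hazard also keeps subjects who have already experienced the competing event. This is why SD coefficients relate directly to the CIF, while CS coefficients describe the event rate among those still at risk.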

Data source
The analyzed data originated from the Surveillance, Epidemiology, and End Results (SEER) database established by the National Cancer Institute. This database is supported by the Surveillance Research Program in the National Cancer Institute Division of Cancer Control and Population Sciences, and includes information on cancer incidence, treatment, and survival for approximately 30% of the US population. The SEER database contains data on cancer cases from various locations and sources throughout the US from 1973 onward, which are comparable with the general population characteristics of the US. SEER data can be used to address multiple topics such as examining the stage at diagnosis according to race/ethnicity, determining trends and incidence rates for various cancer sites over time, and calculating survival according to the stage at diagnosis, age at diagnosis, and tumor grade or size. We utilized the Incidence-SEER 18 Regs Research Data + Hurricane Katrina Impacted Louisiana Cases, Nov 2017 Sub (1973-2015 varying) database. SEER research data are publicly available, and all patient information is de-identified, which meant that institutional review board approval was not required.

Patient selection
We identified 5591 patients diagnosed with SCC between 2004 and 2013. The endpoint of 2013 was selected to ensure that the follow-up was adequate in all of the included women. The following demographic and clinical variables were extracted: age, race, marital status, LN metastasis stage (according to the third edition of the American joint committee on cancer [AJCC]), positive lymph node, metastasis, surgery, chemotherapy, radiation, radiation sequence with surgery (RSS), LND, tumor size, and tumor grade. In the analysis we regrouped the AJCC stages into Ia, Ib, IIa, IIb, III, and IV, and categorized grades I, II, III, and IV as well-differentiated, moderately differentiated, poorly differentiated, and undifferentiated, respectively. The inclusion criteria were as follows: diagnosed between 2004 and 2013; pathological sites including the endocervix, exocervix, overlapping lesion of cervix uteri, and cervix uteri; positive histology with squamous cell carcinoma; presence of a single primary tumor; being the first malignant primary tumor. The exclusion criteria were as follows: autopsy or death certificate; positive histology with adenocarcinoma, gland scale cancer, endometrial carcinoma, and adenoid basal cell carcinoma; unknown tumor grade; unknown marital status; unknown AJCC stages; unknown cause of death; or unknown tumor size. The flow chart of the study is shown in Figure 1.

Clinicopathological factors
We extracted the following 14 factors from the SEER database: age, race, marital status, AJCC stage, LN metastasis, surgery, chemotherapy, radiation, RSS, LND, tumor size, and tumor grade. Age was a continuous variable, while the other variables were categorical. Race was divided into three types: white, black, and other (American Indian/AK Native, Asian/Pacific Islander). Marital status was divided into three types: married, single, and other (separated, divorced, or widowed). LN metastasis was classified into two types: yes and no. Chemotherapy, radiation, and LND were each classified into two types: yes and no/unknown. The types of surgery were yes, no, and unknown. Tumor size was classified into ≤40 mm, 40 mm to 100 mm, and ≥100 mm. RSS included radiation prior to surgery (RPS), intraoperative radiation, radiation after surgery (RAS), and no/unknown. The follow-up results were divided into three conditions: alive, DCC, and DOC.
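The three follow-up outcomes map onto different event indicators depending on the model, which is the practical difference between the three analyses described in the Statistical analysis section. The sketch below illustrates one common way to encode this; the numeric codes and the function name are hypothetical, since the paper does not state how the outcomes were coded in SAS.

```python
# Hypothetical outcome codes: the paper names the three outcomes but not their coding.
ALIVE, DCC, DOC = 0, 1, 2


def event_indicators(outcome):
    """Return the event indicator each analysis would use for one patient.

    - Cox regression (all-cause mortality): any death counts as an event.
    - Cause-specific (CS) model for DCC: only DCC is an event; a DOC is
      treated as censored at the time of death.
    - Sub-distribution (SD, Fine-Gray) model: the full status code is kept,
      because patients with a DOC remain in the risk set for DCC.
    """
    if outcome not in (ALIVE, DCC, DOC):
        raise ValueError(f"unknown outcome code: {outcome!r}")
    return {
        "cox_all_cause": int(outcome in (DCC, DOC)),
        "cause_specific_dcc": int(outcome == DCC),
        "fine_gray_status": outcome,
    }
```

For a patient who died of another cause, `event_indicators(DOC)` marks an event for the all-cause Cox model, a censored observation for the CS model, and a competing event (code 2) for the SD model.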

Statistical analysis
All of the statistical analyses were performed using SAS statistical software (version 9.4, SAS Institute). A univariate analysis was performed with the CIF to assess the potential prognostic contribution of each covariate, and to obtain the P value from Gray's test and the cumulative incidence rate at each time point. Significant covariates (P < .05) were included in multivariate Cox regression, CS, and SD analyses; an SD analysis is also called Fine-Gray competing-risks regression. Multivariate Cox regression analysis was performed to identify covariates associated with an increased all-cause mortality. SCC-specific mortality was assessed using CS and Fine-Gray competing-risks regression. [14] P < .05 was considered statistically significant.
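To make the CIF concrete, the following is a minimal sketch of the standard nonparametric (Aalen-Johansen type) cumulative incidence estimator, together with the naive alternative of computing 1 minus Kaplan-Meier with competing events treated as censored. It is written in plain Python for illustration only; it is not the SAS code used in this study, and the function names and the toy data are assumptions.

```python
def cumulative_incidence(times, status, cause=1):
    """Nonparametric CIF for `cause`.

    status: 0 = censored, 1, 2, ... = event types.
    Returns a list of (time, cif) pairs, one per distinct event time.
    """
    data = sorted(zip(times, status))
    n_at_risk = len(data)
    surv = 1.0  # probability of being free of ANY event just before time t
    cif = 0.0
    out = []
    i = 0
    while i < len(data):
        t = data[i][0]
        d_cause = d_any = n_here = 0
        # Count everything that happens at this distinct time point.
        while i < len(data) and data[i][0] == t:
            s = data[i][1]
            n_here += 1
            if s != 0:
                d_any += 1
                if s == cause:
                    d_cause += 1
            i += 1
        if d_any > 0:
            cif += surv * d_cause / n_at_risk
            surv *= 1.0 - d_any / n_at_risk
            out.append((t, cif))
        n_at_risk -= n_here
    return out


def one_minus_km(times, status, cause=1):
    """Naive estimate: 1 - Kaplan-Meier with competing events censored.

    With a single event type the CIF above equals 1 - KM, so the same
    routine is reused after recoding competing events to 0.
    """
    recoded = [1 if s == cause else 0 for s in status]
    return cumulative_incidence(times, recoded, cause=1)


# Toy data (hypothetical): months of follow-up and status (1 = DCC, 2 = DOC).
toy_times = [1, 2, 3, 4, 5, 6]
toy_status = [1, 2, 1, 0, 2, 1]
cif_dcc = cumulative_incidence(toy_times, toy_status, cause=1)
naive_dcc = one_minus_km(toy_times, toy_status, cause=1)
```

On the toy data, the final naive estimate is larger than the final CIF for DCC, which illustrates why censoring competing deaths overestimates cumulative mortality.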

Patient characteristics
A total of 5591 patients with SCC met the inclusion criteria (Table 1). At the last follow-up, 2671 patients were still alive, and there had been 1578 DCCs and 1342 DOCs. The median age was 48.0 years (range = 19-98 years) for all patients and 50.0 years (range = 21-98 years) for DCCs; the corresponding median follow-up times were 47.0 months (range = 0-43 months) and 16.0 months (range = 0-137 months), respectively. The median age of all patients was similar to that of DCC patients, whereas the median follow-up time was shorter for DCC patients than for all patients. Overall, the largest subgroups comprised patients who were white (73.63%), had no metastasis (91%), had negative regional LNs (71.74%), underwent surgery (54.03%), received chemotherapy (57.99%), received radiotherapy (68.56%), had a no/unknown RSS status (70.41%), had no LND (58.95%), or were married (43.01%).

Univariate analysis
We calculated the crude CIFs for all prognostic factors. Table 2 lists the CIFs for cause-specific mortality that differed significantly (P < .05) according to Gray's test. The results showed that age, race, marital status, AJCC stage, LN metastasis, surgery, chemotherapy, radiation, RSS, LND, tumor size, and tumor grade were statistically significant in the crude CIF for cancer mortality, and so all of these factors were included in the multivariate analysis. The cumulative incidence curves are plotted in Figures 2-4 (covering demographic characteristics, tumor characteristics, and treatment). The cumulative incidence rates at 1, 3, and 5 years were also calculated, and are presented in Table 2.

Multivariate analysis
The three methods showed that age, metastasis, AJCC stage, surgery, chemotherapy, RSS, LND, tumor size, and tumor grade were prognostic factors affecting SCC survival. The HR and 95% confidence interval (CI) values for age, marital status, metastasis, surgery, chemotherapy, RPS, RAS, and LND were almost the same in the three methods, whereas there were large differences between the three methods in the values for stages Ib, IIa, IIb, III, and IV, RBAS, tumor size, grade II, and grade III. For example, the HRs in the SD analysis for stages Ib, IIa, IIb, III, and IV, RBAS, tumor size, grade II, and grade III were very close to those in the CS analysis, while the values in the multivariate Cox regression analysis were lower. In contrast, the P values for race (Cox regression: 0.0019, SD: 0.727, CS: 0.0299) and radiation (0.005, 0.0323, and 0.0058, respectively) demonstrated that race and radiation were prognostic factors affecting survival in the Cox regression and CS analyses, which differed from the findings in the SD analysis. The P values for other marital status (Cox regression: <0.001, SD: 0.5268, CS: 0.2540) showed that being separated, divorced, or widowed was an independent prognostic factor in the Cox regression analysis, but not in the CS and SD analyses. However, the results for grade IV (Cox regression: 0.2219, SD: 0.0.836, CS: 0.0417) showed that it was statistically significant for the prognosis only in the CS analysis.

Discussion
In this study, 5591 patients met the inclusion criteria. At the last follow-up, there were 1342 DOCs, constituting 45.96% of all deaths, which would be treated as censored data when using conventional statistical analysis methods for survival data. Traditional statistical methods for analyzing the risk of disease include Kaplan-Meier survival analysis and Cox regression analysis, but these approaches can overestimate the CIF by failing to account for the competing risks of death. [17,18] The aim of using a competing-risks model is to more accurately identify prognostic factors of SCC, when censoring is absent or when censoring is present but always observed. Competing-risks models were used in hazard function regression in the CS model and the SD model, the latter of which is also called the CIF regression model or Fine-Gray model. [14] The CS model may be better suited to addressing etiological questions, since it allows the effect of covariates on the rate of occurrence of the outcome to be estimated in those subjects who are currently free of events. In contrast, the SD model may be better suited to estimating the clinical prognosis of patients, since it allows the effect of covariates on the absolute risk of the outcome to be estimated over time. [15] Latouche et al pointed out that using SD and CS models simultaneously is generally the most rigorous scientific approach to analyzing competing-risks data. [16] In the present study, the SD model, which focuses on the direct assessment of actual risks and is therefore oriented toward assessing disease risk and prognosis, seemed to be more valuable. Overall, the HR and 95% CI values for most variables were close in the SD and CS analyses, and the direction of the associations was consistent, with only a few variables showing inconsistent results. Compared to the SD model, Cox regression analysis clearly overestimated the prognostic effect of certain covariates such as race, AJCC stage, metastasis, radiation, RSS, tumor size, and tumor grade.
Some studies have found that the incidence of SCC in all age groups except 0 to 24 years old remained stable from 1993 to 2012, while the 5-year survival rate was higher in whites than in blacks between 1983 and 1992 (70.5% vs 58.9%). [6,7,19] During the three decades from 1983 to 2012, the relative risks for age were 1.045, 1.038, and 1.026, respectively, and those for race were 1.221, 1.249, and 1.186. Furthermore, even when accounting for stage, histology, and race, increasing age was associated with worse overall survival across the different stages. For young women aged 20 to 49 years old, aggressive treatment demonstrated a significant survival advantage compared with less-aggressive regimens or no treatment. In contrast, for women aged at least 50 years, aggressive treatment and less-aggressive therapy provided an obvious survival advantage over no treatment. Those studies used the Chi-square test and Cox regression to indicate that age and race were independent negative prognostic factors for overall survival in cervical cancer, which led to statistical deviation to a certain extent. The results of Cox regression, SD, and CS analyses obtained in the present study indicated that age was a prognostic factor for DCC, but the HR was higher in the Cox regression analysis than in the SD and CS analyses, at 1.018, 1.006, and 1.009, respectively. However, the multivariate Cox regression analysis indicated that race was not a prognostic factor for cause-specific mortality, which directly contrasted with the results of the SD and CS analyses. This indicates that the multivariate Cox regression analysis overestimated the effect of race despite the sample being large, which is due to it not being applicable to censored data with competing risks. [15] The results of the SD and CS analyses for the effect of race had a higher reference value.
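Because age was modeled as a continuous covariate, the reported HRs apply per unit of age, and under the log-linear form of these models the HR for a larger age difference is the per-unit HR raised to that difference. The sketch below applies this to the three age HRs quoted above; it assumes that the unit of age is one year (the paper does not state the unit explicitly), and the function name is hypothetical.

```python
import math

# Per-unit HRs for age reported in the text for the three methods.
AGE_HR = {"Cox": 1.018, "SD": 1.006, "CS": 1.009}


def hr_for_difference(hr_per_unit, delta):
    """HR for a `delta`-unit difference in a continuous covariate.

    In a log-linear hazard model, HR(delta) = exp(beta * delta),
    where beta = log(HR per unit).
    """
    return math.exp(math.log(hr_per_unit) * delta)


# HR for a 10-unit age difference (10 years, if age is in years).
ten_unit_hr = {method: hr_for_difference(hr, 10) for method, hr in AGE_HR.items()}

for method, hr in ten_unit_hr.items():
    print(f"{method}: HR per 10 units of age = {hr:.3f}")
```

Small per-unit differences between the methods become more visible on this scale, which helps when comparing the Cox estimate with the SD and CS estimates.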
Marital status has been considered to be an independent predictor of the tumor stage at diagnosis and survival in women with cervical cancer, and its predictive efficacy has been confirmed using multivariate logistic regression models. [8] That study found that unmarried women (single, separated, divorced, or widowed) in the US were diagnosed more often at an advanced stage and had worse survival compared to married women because they were less involved in cervical cancer screenings. Meanwhile, another study used a binary logistic regression model to show that the number of single women with cervical cancer had increased significantly over the past 4 decades, especially among single women aged ≥40 years, [10] which also demonstrated that improving screening strategies might help reduce the incidence of this malignancy. The three models analyzed in the present study produced the same conclusion about the prognosis of married women compared with single women. However, for the other marital statuses (separated, divorced, or widowed), the results of the SD and CS analyses differed from those of the two studies above and of the Cox regression analysis, which showed that only being married was a positive independent contributing factor.
A prognostic analysis based on the SEER database found that most malignancies of the uterine cervix in single women were the squamous cell carcinoma subtype, high grade, and involved larger tumors (>4 cm). [10] Tumor grade, tumor size, and International Federation of Gynecology and Obstetrics (FIGO) stage were associated with an increased risk of LN metastasis in the analysis of overall survival. The multivariate analysis of cause-specific survival and overall survival performed by Macdonald et al revealed that lower grade, lower FIGO stage, smaller tumor, fewer involved LNs, and a lower or zero positive LN ratio were independent predictors compared with para-aortic LN involvement, and positive LNs in cervical carcinoma predicted a prognosis that was inversely related to the number of involved LNs. [20] Colturato et al also demonstrated that LN micro-metastasis was an important risk factor for tumor recurrence. [9] More SCCs were diagnosed in younger women and were of a poor or undifferentiated grade. In contrast, more cervical adenocarcinomas presented with a well-differentiated grade and involved older women. [21] Another study confirmed that increasing FIGO stage gradually decreased the coincidence rate of the two staging methods, [22] and AJCC stage could more accurately reflect the lesion range than the FIGO stage. Furthermore, a negative LN count was an independent prognosis factor for patients with cervical cancer at each FIGO stage, and was a good supplement for evaluating the prognosis of the FIGO stage. [23] Few studies have explored the effect of AJCC stage on the prognosis of SCC compared to the FIGO stage. In our study, AJCC stage, metastasis, tumor size, grade II, and grade III were prognostic factors for SCC in the three models, and higher AJCC stage, larger tumor, LN metastasis, and higher grade (except grade IV, which was not a prognostic factor) had significantly adverse effects on SCC.
Other studies have investigated the effects of different treatments on the SCC disease risk and prognosis using traditional Cox regression analysis. Surgery and radiation were found to be common treatments for patients with cervical cancer stages IA2-IIA. Radical hysterectomy in combination with regional lymphadenectomy is the conventional treatment for these patients. [12] A multivariate analysis showed that the overall survival of stage IA cervical cancer in patients younger than 50 years was significantly better among those who underwent hysterectomy and ovarian conservation compared with those who underwent oophorectomy without radiotherapy, but that the disease-specific survival was approximately the same in the two groups. [11] This indicates that surgical ovarian conservation is a positive independent prognostic factor for overall survival in the early stage among young patients with cervical cancer. However, for locally advanced cervical cancer, hysterectomy and concurrent chemoradiotherapy were standard treatments, and could improve the overall survival. [24] Another study noted that numerous regimens including hysterectomy prolonged the survival time in stage IIB patients but not in stage III patients with cervical cancer. [25] Shah et al also pointed out that more extensive lymphadenectomy had no effect on survival among those with positive LNs, but increased survival in those with negative LNs. [26] Moreover, patients with more LNs resected had a lower probability of dying from cervical cancer. A prospective study indicated that patients with cervical cancer at stages IB1-IIA1 receiving radical hysterectomy had fewer severe postoperative complications including urinary infections and/or lower limb lymphedema, and that preoperative brachytherapy was an independent risk factor for severe morbidity after surgery. [27]
Several studies found that neoadjuvant chemotherapy followed by surgery significantly improved the prognosis in locally advanced cervical cancer, especially for tumors larger than 4 cm. [28][29][30] However, other studies have shown that previous research might have overestimated the treatment effect because it was unclear whether the chemotherapy doses and methods were optimal. [31,32] There have been few studies of how the order of surgery and radiotherapy affects the prognosis. The multivariate analyses of the three models performed in our study revealed that undergoing surgery, chemotherapy, and LND were positive independent prognostic factors for survival. Using those treatments could reduce the cause-specific death rate by two to three times compared to no treatment. Our study has also shown that regardless of the sequence of surgery and radiotherapy, combined treatment methods could increase the mortality rate by two to four times compared to only applying surgery or radiotherapy. Our results contrast with previous reports that undergoing radiation did not significantly affect the prognosis of SCC.
The study was limited by its use of the SEER database, which does not cover all possible factors impacting the patients, instead providing only some basic information. We did not further stratify the prognostic factors based on a comprehensive impact assessment. The small sample size also affects the reliability of the statistical results. In addition, whether the economic status and comorbid conditions of patients are prognostic factors affecting survival in patients with SCC was not analyzed in this study, and this may be considered in future studies. All these factors decrease the reliability of the conclusions drawn.

Conclusion
A competing-risks model is a new statistical method that was used in this study to more accurately identify prognostic factors when censoring was absent or when censoring was present but always observed, compared to the conventional Cox regression analysis that has commonly been used in many previous studies. We also performed conventional Cox regression, SD, and CS analyses because each has its own interpretation and can be used as a reference. The SD model used in this study may be better suited to estimating the clinical prognosis of a patient, which made it possible to estimate the effect of covariates on the absolute risk of the outcome over time. Overall, the results obtained using the SD model analysis were close to those obtained in the CS analysis, but Cox regression overvalued the clinical prognosis of covariates, leading to bias in the results. We expect that the competing-risks model will be more feasible for evaluating the prognosis and guiding clinical practice in the future.

Table 3. Multivariate analysis of prognostic factors in patients with cervical squamous cell carcinoma. AJCC = American Joint Committee on Cancer stage, CS = cause-specific, Grade I = well differentiated, Grade II = moderately differentiated, Grade III = poorly differentiated, Grade IV = undifferentiated, HR = hazard ratio, IR = intraoperative radiation, LND = lymph node dissection, RAS = radiation after surgery, RBAS = radiation before and after surgery, RPS = radiation prior to surgery, RSS = radiation sequence with surgery, SD = sub-distribution, ST = survival time.